
[ENH] Add config validation with descriptive error messages#529

Open
Saurabh6266 wants to merge 1 commit into mllam:main from Saurabh6266:enh/config-validation-error-messages

Conversation


@Saurabh6266 Saurabh6266 commented Mar 27, 2026

Summary

Closes #528

Adds a validate_config() function to neural_lam/config.py that checks for common config problems before training or graph creation begins, and raises InvalidConfigError with clear, actionable messages instead of raw tracebacks.

Motivation

Two common config errors currently produce hard-to-debug tracebacks:

  1. datastore.config_path pointing to a non-existent file causes a deep FileNotFoundError inside init_datastore, with no indication of which config field is wrong.
  2. ManualStateFeatureWeighting with an empty weights dict produces a silent no-op during training rather than an early error.

Both are now caught at startup with a message pointing directly to the relevant field and showing an example fix.

Changes

  • neural_lam/config.py: Added validate_config(config, config_path) called inside load_config_and_datastore after YAML parsing
  • tests/test_config.py: Added five tests for validate_config covering existing path (passes), missing path (raises with correct field name), error message content, empty ManualStateFeatureWeighting (raises), and non-empty ManualStateFeatureWeighting (passes). Existing tests untouched.
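
The validation described above might look roughly like this. This is a hedged sketch, not the actual patch: the real neural_lam config is a structured object, so a plain dict with an assumed field layout stands in here, and the exact message wording is illustrative.

```python
# Hedged sketch of validate_config(); the real neural_lam config object
# differs, so a plain dict with an assumed layout stands in here.
from pathlib import Path


class InvalidConfigError(Exception):
    """Raised for config problems that would otherwise fail deep in training."""


def validate_config(config: dict, config_path: str) -> None:
    """Check for common config problems before training or graph creation."""
    # Check 1: datastore.config_path must point to an existing file,
    # instead of letting init_datastore hit a bare FileNotFoundError.
    datastore_path = Path(config["datastore"]["config_path"])
    if not datastore_path.is_file():
        raise InvalidConfigError(
            f"datastore.config_path: file not found: '{datastore_path}' "
            f"(referenced from {config_path}). Example fix:\n"
            "  datastore:\n"
            "    config_path: path/to/existing/datastore.yaml"
        )

    # Check 2: ManualStateFeatureWeighting must not have empty weights,
    # which would otherwise be a silent no-op during training.
    weighting = config.get("training", {}).get("state_feature_weighting")
    if weighting == {}:
        raise InvalidConfigError(
            "training.state_feature_weighting: manual feature weighting "
            "requires a non-empty 'weights' mapping. Example fix:\n"
            "  state_feature_weighting:\n"
            "    weights:\n"
            "      t2m: 1.0"
        )
```

Calling this inside load_config_and_datastore right after YAML parsing means both failure modes surface at startup with the offending field named, rather than mid-run.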

Why this matters for future development

The probabilistic forecasting project and other extensions will introduce new required config fields. Having a single validation function makes it straightforward to add new checks without risk of new fields producing cryptic runtime errors deep in the training loop.

Testing

pytest tests/test_config.py -v
pre-commit run --all-files
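
The five tests described under Changes could be sketched as follows. This is illustrative only: a minimal stand-in validator is defined inline so the snippet is self-contained, whereas the real test file would import validate_config and InvalidConfigError from neural_lam.config.

```python
# Illustrative pytest sketch of the five validation tests; the inline
# stand-in validator replaces the real imports from neural_lam.config.
from pathlib import Path

import pytest


class InvalidConfigError(Exception):
    pass


def validate_config(config, config_path):
    path = Path(config["datastore"]["config_path"])
    if not path.is_file():
        raise InvalidConfigError(f"datastore.config_path: file not found: '{path}'")
    weighting = config.get("training", {}).get("state_feature_weighting")
    if weighting == {}:  # manual weighting configured but empty
        raise InvalidConfigError(
            "training.state_feature_weighting: 'weights' must be non-empty"
        )


def test_validate_passes_for_existing_path(tmp_path):
    ds = tmp_path / "datastore.yaml"
    ds.write_text("data: {}")
    validate_config({"datastore": {"config_path": str(ds)}}, "config.yaml")


def test_validate_raises_for_missing_path():
    with pytest.raises(InvalidConfigError):
        validate_config({"datastore": {"config_path": "no/such.yaml"}}, "config.yaml")


def test_error_message_names_the_field():
    with pytest.raises(InvalidConfigError, match="datastore.config_path"):
        validate_config({"datastore": {"config_path": "no/such.yaml"}}, "config.yaml")


def test_empty_manual_weighting_raises(tmp_path):
    ds = tmp_path / "datastore.yaml"
    ds.write_text("data: {}")
    cfg = {"datastore": {"config_path": str(ds)},
           "training": {"state_feature_weighting": {}}}
    with pytest.raises(InvalidConfigError, match="state_feature_weighting"):
        validate_config(cfg, "config.yaml")


def test_nonempty_manual_weighting_passes(tmp_path):
    ds = tmp_path / "datastore.yaml"
    ds.write_text("data: {}")
    cfg = {"datastore": {"config_path": str(ds)},
           "training": {"state_feature_weighting": {"t2m": 1.0}}}
    validate_config(cfg, "config.yaml")
```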

Checklist

  • Tests pass locally
  • pre-commit passes
  • Existing tests unaffected

- Added validate_config() to neural_lam/config.py
- Checks datastore.config_path exists on disk before training starts
- Checks ManualStateFeatureWeighting.weights is non-empty
- Raises InvalidConfigError with actionable message pointing to the field
- Added 5 tests to tests/test_config.py covering all validation paths

Closes mllam#528
@princekumarlahon

This is a really nice improvement: catching config issues early with clear error messages will make things much easier to debug, especially for new users. I also like that validation happens right after loading the config and that you added targeted tests for the new cases. I had a few thoughts, mainly around how this might scale as more config options get added:

  1. Right now all the checks seem to live inside validate_config. That works well for now, but I wonder if it might get a bit crowded over time. Maybe grouping validations by sections (like datastore, training, etc.) could help keep things organized as it grows.

  2. It might also be worth standardizing the error messages a bit more, for example always including the full config path (training.state_feature_weighting) and keeping a consistent format for suggested fixes.

  3. Since this is meant to support future additions (like probabilistic forecasting configs), it could be useful to have a clear pattern for adding new validation rules, so contributors don’t have to modify one large function every time.

Out of curiosity, do you plan to expand validation to more fields over time, or keep it focused on the most common failure cases? Overall though, this is a great step toward making config errors much more user-friendly.

@Saurabh6266
Author

Thanks @princekumarlahon, really appreciate the detailed feedback!

These are all good points. To respond to each:

  1. Grouping by section — agreed that a flat validate_config will get crowded as more fields are added. One approach would be breaking it into _validate_datastore and _validate_training helpers called from the main function, so each section stays self-contained. Happy to restructure it that way in a follow-up commit if that direction makes sense to the maintainers.

  2. Consistent error message format — makes sense. Right now the path format is consistent (datastore.config_path, training.state_feature_weighting) but the suggested fix wording varies slightly between checks. I can tighten that into a fixed template — something like [field]: <what's wrong>. Example fix: <yaml snippet> — so all messages follow the same structure.

  3. Extensibility pattern — the intent was exactly this: new fields get registered in one place with their error description, rather than buried in ad-hoc conditionals. If the current structure isn't clear enough on that, I can add a short comment above the validation calls explaining the pattern.
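
To make points 1 and 2 concrete, the restructuring could look roughly like this. Helper names, the message template, and the config shape are all hypothetical; this is a sketch of the pattern, not the proposed patch.

```python
# Hypothetical restructuring into per-section helpers with a shared
# message template; names and config layout are illustrative only.
from pathlib import Path


class InvalidConfigError(Exception):
    pass


def _config_error(field: str, problem: str, example_fix: str) -> InvalidConfigError:
    """Build every error in one format: [field]: <problem>. Example fix: <yaml>."""
    return InvalidConfigError(f"[{field}]: {problem}. Example fix:\n{example_fix}")


def _validate_datastore(config: dict) -> None:
    path = Path(config["datastore"]["config_path"])
    if not path.is_file():
        raise _config_error(
            "datastore.config_path",
            f"file not found: '{path}'",
            "  datastore:\n    config_path: path/to/datastore.yaml",
        )


def _validate_training(config: dict) -> None:
    weighting = config.get("training", {}).get("state_feature_weighting")
    if weighting == {}:  # manual weighting configured but empty
        raise _config_error(
            "training.state_feature_weighting",
            "manual feature weighting requires a non-empty 'weights' mapping",
            "  state_feature_weighting:\n    weights:\n      t2m: 1.0",
        )


def validate_config(config: dict, config_path: str) -> None:
    # New config sections register a helper here instead of growing
    # one flat function, so each section stays self-contained.
    for section_validator in (_validate_datastore, _validate_training):
        section_validator(config)
```

With this shape, a future section (say, one for probabilistic forecasting options) would add one helper and one entry in the tuple, and its error messages would inherit the template for free.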

On your question about scope: the plan is to keep the initial PR focused on the two most failure-prone cases (missing datastore file, empty manual weights) and expand incrementally as new config fields are introduced — especially the ones needed for the probabilistic forecasting extension. That way each addition stays reviewable and testable on its own.

Let me know if you'd like me to address any of these before review, or if it's better to handle them as follow-up PRs.

@princekumarlahon

This sounds great; I like the direction. Splitting validation into section-based helpers and standardizing error messages both make sense. And +1 to keeping this PR focused and handling structural improvements as follow-ups. I think this is in a good place.



Development

Successfully merging this pull request may close these issues.

[ENH] Improve config validation error messages for missing or invalid fields
